The Third International Chinese Language Processing Bakeoff: Word Segmentation and Named Entity Recognition

نویسنده

  • Gina-Anne Levow
چکیده

The Third International Chinese Language Processing Bakeoff was held in Spring 2006 to assess the state of the art in two important tasks: word segmentation and named entity recognition. Twenty-nine groups submitted result sets in the two tasks across two tracks and a total of five corpora. We found strong results in both tasks as well as continuing challenges.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The Fourth International Chinese Language Processing Bakeoff: Chinese Word Segmentation, Named Entity Recognition and Chinese POS Tagging

The Fourth International Chinese Language Processing Bakeoff was held in 2007 to assess the state of the art in three important tasks: Chinese word segmentation, named entity recognition and Chinese POS tagging. Twenty-eight groups submitted result sets in the three tasks across two tracks and a total of seven corpora. Strong results have been found in all the tasks as well as continuing challe...

متن کامل

Character Language Models for Chinese Word Segmentation and Named Entity Recognition

We describe the application of the LingPipe toolkit (Alias-i 2006) to Chinese word segmentation and named entity recognition. We provide results for the third SIGHAN Chinese language processing bakeoff (Levow 2006). F1 measures on the best performing corpora were .972 for word segmentation and .855 for person/location/organization named-

متن کامل

A Pragmatic Chinese Word Segmentation System

This paper presents our work for participation in the Third International Chinese Word Segmentation Bakeoff. We apply several processing approaches according to the corresponding sub-tasks, which are exhibited in real natural language. In our system, Trigram model with smoothing algorithm is the core module in word segmentation, and Maximum Entropy model is the basic model in Named Entity Recog...

متن کامل

NetEase Automatic Chinese Word Segmentation

This document analyses the bakeoff results from NetEase Co. in the SIGHAN5 Word Segmentation Task and Named Entity Recognition Task. The NetEase WS system is designed to facilitate research in natural language processing and information retrieval. It supports Chinese and English word segmentation, Chinese named entity recognition, Chinese part of speech tagging and phrase conglutination. Evalua...

متن کامل

An Improved CRF based Chinese Language Processing System for SIGHAN Bakeoff 2007

This paper describes three systems: the Chinese word segmentation (WS) system, the named entity recognition (NER) system and the Part-of-Speech tagging (POS) system, which are submitted to the Fourth International Chinese Language Processing Bakeoff. Here, Conditional Random Fields (CRFs) are employed as the primary models. For the WS and NER tracks, the ngram language model is incorporated in ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006